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1.  INTRODUCTION 


A  Bayesian,  smoothness  priors  approach  is  introduced  in  this  paper  for  transfer  function  esti¬ 
mation.  Jointly  stationary  input  and  output  data  is  assumed  to  be  observed  in  the  presence  of 
additive  colored  noise.  The  method  is  particularly  relevant  when  only  short  spans  of  data  are 
available,  when  the  impulse  response  is  relatively  long-tailed  and  when  the  low  order  polynomial 
ARMAX  type  model  can  not  capture  the  true  model  structure.  The  method  is  illustrated  by  the 
analysis  of  the  Box-Jenkins  Series  J  data.  The  statistical  performance  of  the  method  is  explored  in 
Monte-Carlo  simulation  studies. 

The  models  in  Astrom  and  Bohlin  (1965)  and  Box  and  Jenkins  (1970),  are  the  classical 
parametric  time  domain  transfer  function  models.  In  that  method,  ARMAX  type  models  charac¬ 
terized  by  polynomial  operators  on  the  input,  the  output  and  the  observation  noise  are  fitted  to 
the  observed  input  and  output  data.  (The  observation  noise  in  the  Astrom  -Bohlin  model  is  MA 
noise.  It  is  AR  noise  in  the  Box-Jenkins  model.)  That  method  requires  the  specification  of 
three  polynomial  operator  orders,  one  each  for  the  input,  output  and  noise  polynomials  and  the 
estimation  of  the  unknown  polynomials  coefficients  via  the  minimization  of  a  performance  func¬ 
tional.  Typically  that  computation  is  achieved  by  a  computationally  costly  nonlinear  optimization 
procedure.  In  such  procedures  it  is  only  feasible  to  search  for  solutions  over  low  polynomial  orders. 

Despite  the  fact  that  conventional  transfer  estimation  methods  have  been  extensively  used, 
the  influence  of  the  sampling  variability  in  the  polynomial  model  order  selection  on  the  transfer 
function  estimation  performance  remains  to  be  explored.  Another  objection  to  the  use  of  low- 
order  polynomial  ARMAX  models  is  that  the  "parsimonious"  parametric  model  may  not  be  a  good 
characterization  of  the  system  that  generated  the  data.  An  elaboration  of  this  objection  to  the 
conventional  parametric  modeling  method,  from  a  Bayesian  point  of  view,  is  that  the  conventional 
parametric  modeling  methods  can  not  yield  "correct"  models.  That  is.  if  there  is  information  in 
the  data  to  select,  by  some  best  model  order  selection  procedure,  an  ARMAX  p,q.r  model,  then 
there  is  also  information  in  the  data  to  select  alternative  ARMAX  p'.q'.r'  models.  There  the  best 


Bayesian  model  requires  that  the  transfer  functions  computed  with  different  model  orders  be  aver¬ 
aged.  with  respect  to  the  likelihood  of  each  fitted  model  and  the  prior  probabilities  of  the  model 
orders.  Akaike  (1979a)  is  perhaps  the  first  example  of  a  Bayesian  time  series  modeling  with 
averaging  of  the  computational  results  over  models  with  different  mode)  orders.  The  smoothness 
priors  method  of  transfer  function  estimation  introduced  here  obviates  the  3  stringent  parameter 
model  order  search  problem  in  conventional  transfer  estimation  procedures.  Our  procedure  also 
uses  3  parameters.  One  is  a  1  dimensional  model  order  parameter,  the  other  two  are  smoothness 
differential  order  parameters.  We  demonstrate  that  the  values  of  those  parameters  are  not  critical 
in  determining  the  estimated  transfer  function  properties. 

The  Mayne-Firzoon  (1978)  three  stage  least  squares  (3SLS)  procedure  was  developed  to 
avoid  the  costliness  of  the  maximization  of  the  nonlinear  likelihood  for  the  transfer  function 
modeled  wirth  additive  MA  noise..  Almost  contemporaneous  alternative  linear  computational 
transfer  function  estimation  procedures  were  generalized  least  squares  (Clarke,  1967)  and  extended 
least  squares  (Panuska,  1968)  Durbin  (1961).  a  2SLS  procedure,  was  the  conceptual  predecessor 
of  the  Mayne-Firzoon  procedure.  Astrom  and  Mayne  (1982)  and  Hannan  et  al  (1986)  are  recur¬ 
sive  procedure  realizations  of  the  Mayne-Firzoon  SSLS  transfer  function  estimation  procedure. 
Other  recent  noteworthy  publications  on  or  related  to  transfer  function  estimation  include  Hannan 
and  Rissanen  (1982),  a  recursive  method  for  finding  model  orders,  and  Ljung  (1985).  a  study  of  the 
statistical  properties  of  time  and  frequency  domain  transfer  function  estimation  procedures. 

Jordinson  et  al.  (1970),  Newbold  (1973),  Wegman  (1980),  Jakeman  and  Young  (1982),  and 
Kruc  et  al.  (1982)  are  examples  of  the  literature  on  statistical  regularization  and  Bayesian 
smoothed  deconvolution  procedures  for  the  estimation  of  transfer  functions.  Applications  of  that 
literature  include  the  estimation  of  the  transfer  function  of  the  vascular  system,  applications  in 
radiology,  dispersive  relations  in  streams  etc  This  activity  is  not  summarized  here  .  Our  own 
smoothness  priors  method  is  a  variation  on  that  Bayesian  theme 
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In  our  method,  an  Mth  order  impulse  response  between  input  and  output  plus  an  Mth  order 
autoregressive  (AR)  model  for  the  additive  noise  is  assumed,  with  M  "quite  large".  This  model  is 
equivalent  to  an  ARMAX  plus  white  noise  model.  We  assume  integrated  square  zeroth  and  kth 
order  derivative  frequency  domain  smoothness  constraints  on  the  polynomial  operators.  In  the 
least  squares  framework,  the  resultant  model  strikes  a  balance  between  the  infidelity  of  the  solution 
to  the  data  and  the  infidelity  of  the  solution  to  the  smoothness  constraints.  That  balance  or  tra¬ 
deoff  is  characterized  by  one  parameter  for  each  of  four  smoothness  constraints.  In  Bayesian  ter¬ 
minology,  those  are  referred  to  as  hyperparameters,  (Lindley  and  Smith,  1972).  The  likelihood  of 
the  hyperparameters  that  characterize  the  class  of  smoothness  priors  is  maximized  to  yield  the 
best  transfer  function  model  with  the  best  data  dependent  priors. 

The  approach  taken  in  this  paper  is  an  application  of  our  frequency  domain  smoothness  pri¬ 
ors  AR  model  spectral  estimation  method,  Kitagawa  and  Gersch  (1985a).  Some  of  our  other 
time-domain  smoothness  priors  papers  on  the  modeling  of  nonstationary  time  series  are  Brotherton 
and  Gersch  (1981),  Kitagawa  (1981)  and  Kitagawa  and  Gersch  (1984,  1985b).  Additional  Bayesian 
methods  of  time  series  analysis  are  in  TIMSAC-84,  Akaike  et  al.  (1985).  Gersch  and  Kitagawa 
(1987).  is  a  review  of  our  smoothness  priors  modeling  of  time  series.  The  papers  by  Shiller  (1973) 
and  Akaike  (1979b.  1980)  are  predecessors  to  our  own  work.  In  particular.  Akaike  (1980), 
motivated  our  interest  in  this  subject.  Additional  related  work,  that  is  better  known  in  the 
statistics  literature,  are  the  methods  of  regularization.  Tikhonov  (1965).  and  the  maximized  penal¬ 
ized  likelihood  method  (I.J  Good  1970.  and  Good  and  Gaskins  1980).  Wahba  (1977), (1982), (1983) 
and  O’Sullivan  (1987). 

The  smoothness  priors  method  of  transfer  function  estimation  is  treated  in  Section  2.  An 
example  of  the  smoothness  priors  analysis  of  the  Box-Jenkins  Series  J  data  is  shown  in  Section  3. 
Studies  of  the  statistical  performance  of  the  smoothness  priors  method  and  comparisons  of  the  SP 
and  3SLS  methods  of  transfer  estimation  are  also  in  Section  3.  An  interpretation  of  the  results, 
summary  and  discussion  in  Section  4  conclude  the  paper. 


3 


2  ANALYSIS 


In  Section  2.1,  several  features  of  a  Bayesian  model  for  linear  regression  are  shown.  A 
variety  of  other  analyses  of  the  Bayesian  model  are  shown  for  example  in  Zellner  (1971)  and 
Broemeling  (1985).  The  presentation  here  uses  assumptions  on  the  priors  of  the  parameter  vector 
similar  to  those  used  earlier  in  Lindley  and  Smith  (1972)  and  Akaike  (1980).  Illustrations  of 
smoothness  priors,  a  particularization  of  the  Bayesian  linear  model  analysis,  are  shown  in  applica¬ 
tions  to  the  time  series  analysis  problems  considered  by  Whittaker  (1923)  and  Shiller  (1973)  in 
Section  2.2.  Our  transfer  function  model  is  described  in  Section  2.3.  In  contrast  with  the  time 
domain  priors  on  the  model  coefficients  in  the  Whittaker  and  Shiller  problems,  in  Section  2.4  we 
show  an  application  of  smoothness  priors  in  the  frequency  domain,  in  transfer  function  estimation. 


2.1  A  BAYESIAN  LINEAR  REGRESSION  MODEL 
Consider  first,  the  linear  regression  model 

y  =  X8  *  <  .  (2.1.1) 

In  (2.1.1),  y  =  (y, ...  ,yy).  is  an  nxl  vector  of  observations.  8  is  a  pxl  parameter  vector.  X  is  an 
nxp  known  design  matrix  and  e  is  an  i.i.d  nxl  random  vector  with  «  ~.V(0.o3/).  Then,  the  con¬ 
ditional  data  distribution  is 


In  the  stochastic  regression  problem,  6  is  considered  to  be  a  random  vector  with  distribution  *(0) 
From  Bayes  theorem,  the  posterior  distribution  of  the  parameter  vector  8  is  proportional  to  the 
product  of  the  conditional  data  distribution  (the  likelihood).  p[y  X.6,a7),  and  the  prior  distribu¬ 
tion.  x(0). 

x(8  y.o)  q  p(y  X.6.o)n(8)  .  (2  1.3) 


Let  the  prior  distribution  of  8  be  D6  -  X(D80.\iI).  In  the  application  of  Bayesian  regression 


methods  to  smoothness  priors  problems,  we  shall  consider  the  special  case  of  80  =  0  Also,  it  is  con¬ 
venient  to  introduce  the  parameterization,  A  =  t[o.  In  that  case,  the  prior  distribution  on  8  is 


\V2  I  J 

#(tf|  r,a)  =  (2*ir,)-^J  \PDtD  exp{  — ——6TDTD8\  . 

1<r 


(2-1-4) 


r  is  referred  to  as  the  hyperparameter  of  the  prior  distribution,  (Lindley  and  Smith,  1972).  In  this 
conjugate  family  Bayesian  situation,  in  which  both  the  priors  and  conditional  data  distribution  are 
normally  distributed,  the  posterior  distribution  is  also  normally  distributed,  (Zellner  1971).  The 
mean  of  that  distribution  is  easily  computed  as  the  minimiter  of 


B 

r 

f„ 

(a 

lo 

-  \rD 

8 

(2  15) 


If  t  were  known,  the  computational  problem  in  (2.1.5)  could  be  solved  by  an  ordinary  least 
squares  computation.  The  solution  for  9  is 

6  -  [atA  -  r 2Dtd]  '  XTy  (2.1.6) 

with  the  residual  sum  of  squares. 

SS£(r)  =  yry  -  ^[a^A  -  r*£>r£>]  (2-1.7) 

The  posterior  distribution  of  8.  r(0  y.r.ff)  is  a  proper  distribution,  therefore  the  likelihood 
for  the  unknow  n  parameter  r  can  be  determined  by 

£(r,<r)  =  w[8‘  y,r.ff)J8  .  (2.1.8) 

1  J  Good  (1965)  referred  to  the  maximization  of  (2  18)  as  a  Type  II  maximum  likelihood  method 
Since  x[8  y.r.a )  is  normally  distributed.  (2.1.8)  can  be  expressed  in  the  closed  form.  (Akaike  1980). 

L{r.a)  =  (2x«rs) ” rDTD  1  1  XTX  -  r 3DrD  ' 1  5expj  — -i-S5£(r)j  (2.1.9) 

The  maximum  likelihood  estimator  of  <rJ  is 
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a 1  =  SSE(t)IK  . 


(2.1.10) 


It  is  convenient  to  work  with  -2  log  likelihood.  Using  (2.1.10)  in  (2.1.9)  yields 

(2.1.11) 

-2logL(r,a)  =  NlogU  -  Nlog(SSE(r)  /  N)  -  log  XT  X  +  i*DTD\  -  log|r2DrD|  +  N. 

This  is  the  basic  relation  that  we  use  in  our  smoothness  priors  least  squares  analysis.  A  practical 
way  to  determine  the  value  of  i3  for  which  the  -21og-likeiihood  is  minimized,  is  to  compute  the 
likelihood  for  discrete  values  of  r2  and  search  the  discrete  -21og  likelihood-hyperparameter  space  for 
the  minimum.  If  there  are  more  than  say  2  hyperparameters,  it  might  be  more  expeditious  to  use 
a  gradient  search  algorithm  to  determine  the  hyperparameters  that  maximize  the  likelihood. 
Akaike  (1980)  demonstrated  the  first  practical  use  of  the  likelihood  of  the  Bayesian  model  and  the 
use  of  the  likelihood  of  the  hyperparameters,  as  a  measure  of  the  goodness  of  fit  of  a  model  to  data. 

Several  other  facets  of  stochastic  regression  may  be  of  interest.  The  solution  of  the  ordinary 
least  squares  regression  problem  in  (2.1.1)  is 

hs  -  (XTX)-'XTy  .  (2.1.12) 


Matrix  algebra  yields 


8  =  (XT X  -  i2DtD)-'(XtX8ls  4  i2DtD60). 


(2.1.13) 


That  is.  the  posterior  parameter  estimate  is  a  weighted  sum  of  the  least  squares  solution  and  the 
prior  mean.  90.  Let  8,  be  the  true  value  of  the  parameter  vector  8.  Then,  direct  evaluation  of  the 
mean  square  parameter  vector  error,  MSE(8)  =  Vot(6 )  +  E[8-8t)  T E((9-6t),  yields  the  result 

MSE(8)<MSE{eLS)  (2.1.14) 


iff 


tr[XTX)-'2tr{XTX  +  r2DrZ?)-1  -  *[9,-90)TDT[ XTX  +  T 3DTD)-*D{9,-e0)  . 

The  first  term  on  the  RHS  of  (2.1.14)  is  not  larger  than  the  LHS  of  (2.1.13).  Depending  upon  how 
close  0C  is  to  6,.  the  MSE{8)  may  or  may  not  be  less  than  MSE(8ls).  The  Bayesian  method 
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minimizes  expected  loss.  Ther  /ore  the  expected  value  of  MSE(6)  will  be  less  than  or  equal  to  the 


expected  value  of  MSE{6ls). 


2.2  SOME  EXAMPLES  OF  SMOOTHNESS  PRIORS  MODELING: 

Two  of  the  earliest  smoothness  priors  problems  are  illustrated  here.  We  refer  to  those  as  the 
W'hittaker  problem,  (Whittaker  1923),  and  the  Shiller  problem  (Shiller  1973), 

The  Whittaker  Problem:  In  the  problem  treated  by  Whittaker,  the  observations  yn,n  =  l,...,Ar 
are  given.  They  are  assumed  to  consist  of  the  sum  of  a  "smooth"  function  and  observation  noise, 

V.  -  /.  -  (2  2.1) 

The  problem  is  to  estimate  the  unknown  /n,n  =  l,...,A'.  In  a  time  series  interpretation  of  this  prob¬ 
lem.  /„.n  =  1,...,A‘  is  the  trend  of  a  nonstationary  mean  time  series.  A  typical  approach  to  this 
problem  is  to  use  a  class  of  parametric  models.  The  quality  of  the  analysis  is  completely  depen¬ 
dent  upon  the  appropriateness  of  the  assumed  model  class.  A  flexible  model  is  desirable.  In  this 
context,  Whittaker  suggested  that  the  solution  balance  a  tradeoff  of  goodness  of  fit  to  the  data  and 

goodness  of  fit  to  a  smoothness  criterion.  This  idea  was  realized  by  determining  the  /„,n=l . A’ 

to  minimize 

is  (».-/.)*  "VE  to*/.)*;  (2.2.2) 

»=I  n= I 

for  some  appropriately  chosen  smoothness  tradeoff  parameter  ^!.  In  (2.2.2)  V*/„  expresses  a  kth- 
order  difference  constraint  on  the  solution  f.  with  y/„  =  /»  -  /n-i-  'Z’fl  -  V(V/B)i  etc. 
(Whittaker’s  original  solution  was  not  expressed  in  a  Bayesian  context.  Whittaker  and  Robinson. 
1924  does  invoke  a  Bayesian  interpretation  of  this  problem.) 

The  properties  of  the  solution  to  the  problem  in  (2.2.1 )- ( 2.2 . 2 )  are  clear  If  /<’=().  /„  =  yn 
and  the  solution  is  a  replica  of  the  observations  As  /i5  becomes  increasingly  large,  the  smoothness 
constraint  dominates  the  solution  and  the  solution  satisfies  a  kth  order  constraint.  For  large  p5 
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and  k=  1.  the  solution  is  a  constant,  for  t  =  2,  it  is  a  straight  line  etc..  Whittaker  left  the  choice  of 
It*  to  the  investigator. 

From  the  Bayesian  point  of  view,  the  difference  equation  constraints  on  the  parameter  vector 
problem  are  stochastic.  That  is,  V*/»  =  w„,  with  u>„  assumed  to  be  an  i.i.d.  normally  distributed 
zero-mean  sequence  with  unknown  variance  r3  .  For  example  for  1  =  1  and  1  =  2  those  constraints 
are: 

L  =  (2.2.3) 

fn  ~  2/n-i  f n-2 

Corresponding  to  the  matrix  D  in  (2.1.6),  for  difference  orders  1  =  1  and  1  =  2  respectively,  the 
smoothness  constraints  can  be  expressed  in  terms  of  the  SxN  constraint  matrices  Dx  and  D2, 

a 

-0  fl 
1  -2  1 
1-2  1  0 

1  -2  1 

In  (2.2.4)  a  and  $  are  small  numbers  that  are  chosen  to  satisfy  initial  conditions.  (An  alternative 
to  the  ad  hockery  in  specifying  o  and  3  in  (2.2.4)  is  to  estimate  f0  and  fx  by  a  maximum  likeli¬ 
hood  method.) 

We  use  the  parameterization  n=a  t.  Therefore,  p1  has  a  noise-to-signal  interpretation. 
Larger  r  corresponds  to  smoother  trends.  For  fixed  k  and  fixed  r5  the  least  squares  solution  can  be 
expressed  in  the  form  of  (2.1.5).  The  matrix  X  and  the  parameter  vector  9  in  (2.1.5)  are  replaced 
by  the  identity  matrix  /  and  the  parameter  vector  /=(/,. ,../v).  Then  for  example,  with  1  =  2  and 
D  =  £);.  the  solution  \f„,n  =  1 . V}  satisfies 


(2.2.4) 


(2.2.5) 

From  (2. 1.6), the  solution  to  (2.2.5),  with  D=DJt  is 

/  =  | /  +  ,  (2.2.6) 

and  the  value  of  SSE(r)  is  given  by  (2.1.8)  with  0=  f  ,X -  I ,D  =  D2-  The  minimized  value  of  -21og 
likelihood  for  this  problem  is: 

(2.2.7) 

-2/oyL(r.d)  =  Slog2n  -  A7oy(-^SS£(r))  -  loglfDfDj  +  /|  -  log\  f2D2rD,i  -  A'. 

The  Shiller  Problem:  The  problem  treated  by  Shiller  is  the  estimation  of  the  distributed  lag 

(impulse  response),  given  the  jointly  stationary  time  series  observations,  {yn.zn;  n  =  l . -V}.  The 

distributed  lag  model  is, 

M 

y«  =  S  hrn*n-m  ~  <n  (2.2.8) 

m  =0 

Frequently  and  also  in  the  case  of  the  data  analyzed  by  Shiller.  econometric  data  is  short  duration. 
As  a  result,  econometricians  have  been  motivated  to  Bayesian  analyses.  They  assume  a  prior  dis¬ 
tribution  on  the  parameters  of  the  model  and  thereby  increase  the  effective  data  length.  The 
smoothness  priors  assumed  on  the  distributed  lag  coefficients  by  Shiller  were  of  the  form, 

=  u;m.  with  { wm  ,m=0 . M  ),  a  zero  mean,  normally  distributed  zero  mean  i.i.d  sequence. 

Those  are  the  same  priors  assumed  for  the  Whitaker  example.  W’e  take  h0  to  be  0  and  for  simpli¬ 
city  consider  the  initial  conditions.  z0....z,_M.  as  known.  Then,  the  computational  problem  for  the 
smoothness  priors  distributed  lag  parameters,  is  as  in  (2  1.6).  In  the  application  of  (2.1.6)  to  the 
Shiller  problem,  the  parameter  vector  6~h.  and  the  Axm  matrix  X  as  given  below  in  (2  2.9)  and 
the  matrix  D  either  D]  or  D2  as  in  (2.2.4 ) 


V 

1  \ 

0 

1 

- 

tD,\  J 

(2.2.9) 


2.3  A  TRANSFER  FUNCTION  MODEL 


Assume  that  input/output  jointly  stationary  time  series  data  zn,yn  n  =  l,,..,.V  is  observed 


Assume  that  the  output  y„  is  observed  in  the  presence  of  additive  colored  noise,  wn.  Consider 


representation  of  the  input/output  plus  noise  in  the  impulse  response  plus  colored  noise  form. 


(2  3.1a) 


u'n  S  **«**’»»- m  • 


(2.3.1b) 


For  convenience  in  (2.3.1b),  u„  is  assumed  to  be  a  Gaussian  zero-mean  uncorrelated  sequence  with 


unknown  variance  <7^.  In  (2.3.1a)  bm  is  an  impulse  response  sequence  and  u„  is  assumed  to  be  in 


AH  model  form 


Using  the  assumed  stationarity.  (2.3.1a)  yields 


U'n-j  y*-J 


(2  3  2) 


Substituting  the  expression  for  into  (2.3.1b)  yields  the  model 


y n  3Z  Vn-m 


(2  3.3) 


dm  =  bm  -  3D  b,am.  ,.  m ^  1 .... 


Equation  (2.3.3)  is  an  ARMAX  mode)  with  additive  white  noise  u„.  The  models  in 
(2.3. la), (2. 3. lb)  are  estimated  using  (2 .3.3)  and  (2  3.4)  The  infinite  order  transfer  function  model 
in  (2.3.3)  is  approximated  by  finite  transfer  function  model 

u  M 

Vn  ~  E  S  '  (2.3.5) 

m  ~  1  *rl 

with  M  assumed  to  be  “large".  (The  choice  of  M  may  be  determined  by  the  maximization  of  a 
likelihood  and  Akaike's  AIC.)  The  coefficients  em,dm  m  =  l,...,A/  are  directly  estimated  by  the 
Bayesian  procedure  described  in  the  following  section.  The  estimates  of  the  coefficients  of  the 
model  (2.3.1a)  are  then  obtained  by  the  formulas 

am  =  -cm ,  m  =  1 . M  (2.3.6) 

am  =  0,  m  =  M-rl.... 


t>m  =  dm  +  £  rn=l,...,M 

1=) 


M 

bm  =  Ea>fc»-i>  rn  =  M~  l.Af+2,... 
1  =  1 


From  (2  3.1a),  the  frequency  response  function  from  the  input  r„  to  the  output  yn  can  be 
obtained  from 


*(/)  =  £  -2 xijm) 

m- 1 


(2.3.7) 


where  t--l.  The  power  spectrum  of  the  noise  wn  is  given  by 


S(f)  = 


M 

I  1  -  £  a*.'*?  -2tti fm 

m~  1 


(2.3.8) 


where  a1  is  the  innovations  variance  of  the  estimated  model  (2.3.5) 


Identify  the  quantities  C(/)  and  D(f) 


(23.9) 


c(f)  =  1  -  E  e»«p!-2*«m/;, 

m  =* 1 


M 

0(f)  =  E  rf«e*P'-2»‘/"i; 

m*l 


Then,  a  more  convenient  form  for  the  frequency  response  function  is 


M/) 


mu 

C(t ) 


M 

E  4»«P|-2*«/m] 

m  =  I 


M 

1-  E  em«xp  -2*t/m 

**i  =  1 


(2.3.10) 


2.4  A  SMOOTHNESS  PRIORS  TRANSFER  FUNCTION  MODEL 


The  Whittaker  and  Shiller  problems,  Section  2.2,  are  examples  of  smoothness  priors  mode) 
parameter  constraints  in  the  time  domain.  Akaike  (1979b)  is  very  likely  the  first  example  of  fre¬ 
quency  domain  smoothness  priors  constraints  In  this  section,  we  employ  frequency  domain 
parameter  constraints  that  are  similar  to  those  that  were  successfully  used  in  our  smoothness  priors 
modeling  of  AR  models  for  spectral  estimation.  Kitagawa  and  Gersch  (1985a).  (Gersch  and  Kita¬ 
gawa  1984  was  an  earlier  frequency  domain  priors  version  of  the  SP  transfer  function  method  ) 


Let  Rk  and  Qk  respectively,  measures  of  the  roughness  of  the  C(/)  and  D(f)  polynomials, 
be  characterized  by  the  integrated  square  kth  derivative  of  those  operators. 


*4  =  r  Qk=  Uf. 

*  j-i/j  j-i/z  dfk 


(2-41) 


Then  using  the  definition  of  C(/),D(/),  equation  (2.3.9).  direct  evaluation  of  (2  4  1)  yields 


Rt  =  (2*)3*  E  m3‘c3  .  Qk  -  (2*)3‘  £  mud7 

m  =  1  m  -  1 


(2  4  2) 


From  the  definitions  in  (2  4.1).  large  values  of  Rk  and  Qk.  respectively  mean  an  unsmooth,  in  the 
sense  of  kth  differential,  frequency  domain  measure  of  the  r(  |  and  d(  )  polynomials  We  also 
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introduce  the  zero-tk  derivative  smoothness  constraints 


Ro  =  C(f) |  *<//=!+  r  e* 


m=  1 


(2.43) 


r»/Z  « 

Let  the  differential  orders  for  the  numerator  and  denominator  polynomials  to  be  I,,  and  Jt3  respec¬ 
tively.  With  these  "frequency  domain"  priors  we  then  have  the  constrained  least  squares  problem 
which  for  fixed  values  of  i,,tj  and  t*,  /=!,.. .,4  determines  the  {em,dm,  m  =  l,...,A/}  that  minimizes 


(2.4.4) 


N 


n  =  1 


M 

^ "  cm  y„-m 
m  =  1 


AT 


E  dm*n-m?  +  E  +  +  E  +  dl 


M 


In  (2.4.4)  j'=  1 . 4  are  the  tradeoff  parameters.  By  a  proper  choice  of  the  tradeoff  parameters, 

our  estimate  of  the  model  parameters,  (cm,dm,  m  =  l,...,M},  balance  a  tradeoff  between  infidelity  of 
the  transfer  function  solution  to  the  data  and  infidelity  to  the  smoothness  constraints. 


2.5  THE  SMOOTHNESS  PRIORS  LEAST  SQUARES  PROBLEM,  DETERMINING 
THE  TRADEOFF  PARAMETERS 


As  indicated  in  Section  2.1,  the  constrained  least  squares  problem  has  a  Bayesian  interpreta¬ 
tion  which  facilitates  the  determination  of  the  tradeoff  parameters  in  the  criterion.  W  e  apply  that 
approach  to  the  particular  smoothness  prior  transfer  function  problem  at  hand. 

In  detail,  the  minimization  of  (2.4.4)  is  equivalent  to  the  maximization  of 


(2.5.1) 


1 


N  M  M 

C X p “  E  E  cm  Vn-m  ~  E  dmIn-m 

2(7"  „  )  m - 1  i».| 


In  that  form,  the  constrained  least  squares  solution  has  a  Bayesian  interpretation  as  the  maximum 
a  posteriori  estimate  of  the  model  with  the  data  distribution 
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(2.5.2) 


P(vi  Xf'O)  =  (2»<7*)-w/*exp{  — i-  £ \yK  -  Y  'nyn-m 

2<t  *<i  «■- 1 


V 


V 


-Yd  z  '* 


and  the  prior  distribution 


*(0  r,<r)  =  (2»(t,)-a#|  0r£>!  l/2tzp\~ eT DT D$\ 

2  o3 


(2-5.3) 


where  r  denotes  the  vector  of  hyperparameters  r,,...r4,  and  0  denotes  the  model  parameters, 
cm.dm,rn  =  l,...,M.  In  (2.5.3) 


(^),/j 

C1 

rf, 

(’f +2J*,r§),/I 

,*  = 

CM 

(rf  +  A^’r2)'/2 

dM 

From  (2.5.2)  and  (2.5.3)  it  follows  that 

p(v  X,6,o)x(6  t,o)  =  (2.5.5) 

(2™2)-<a'-jm>'-  DtD\  '/^exp  —  {0-6)t(XtX  -  DtD)-'(6  -  tf)]exp  -^-55(r)j, 

9/t2  9/t* 


where 


yo 

*0 

yt-,w 

Z\M 

yi 

Vt 

yi-M 

Z2~M 

Vj 

X  = 

.  2  = 

V.v-i 

**-i 

5* 

1 

* 

N 

* 

i 

* 

y.v 

(2.5.6) 


and 


0  =  (A’TA  +  DtD)  'Xt:  . 
SS(r)  =  rrj  -  *T(ArA  *  Z?rZ?)0 


(2.5.7) 


In  (2.5.6).  the  initial  condition  data  i  =  -A/.-Af-l . 0}  are  assumed  known.  Integrating 
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(2.5.5)  with  respect  to  the  parameter  vector  yields  the  likelihood  of  the  hyperparametere 


Then 


L(t,o)  =  (2»<TJ)-*'1tDTD!,'Ji  XTX  -  DTD  -»/*exp ; — ££llL;. 


(2.5.8) 


IogZ,(r,o)  =  — -log2fr<rs  -*•  i-log’  DtD\  — -log  XTX  +  DtD\ - 55(r),  (2  5  9) 

112  ’la1  '  ' 

with  the  maximum  likelihood  estimate  of  «r5  given  by 

=  ‘  (2.5.10) 

Substituting  the  estimated  value  of  o1  from  (2.5.10)  into  (2.5.9)  yields 

logL(r.o)  =  -Alog2irffJ  -  -Mog  DtD  -  -i-logi  ATA  -  DtD  -  ,  (2  5.11) 

i  i  Z  2 

which  is  to  be  maximized  to  obtain  the  maximum  likelihood  estimates  of  r]  _j=l . 4  The  likeli¬ 

hood  for  the  hyperparameters  is  maximized  via  a  Davidon-Fletcher-Powell  gradient  search  algo¬ 
rithm  That  algorithm  is  exterior  to  a  Householder  transformation  least  squares  solution  of  the 
constrained  least  squares  problem. 

We  use  the  A1C  statistic.  (Akaike  1973). 


AlC  =  -21ogL(r.ff)  *  2 (number  of  parameters  estimated)  (2.5.12) 

=  .\{log2ircr3  -  1 )  —  log 1  DT D  -  log  XT X  -*•  DT D  ■+  2(num6er  of  parameter s  estimated). 

to  determine  the  order  M  for  the  transfer  function.  Akaike  (1980)  referred  to  (2  5.12)  as  the  likeli¬ 
hood  of  the  Bayesian  model.  The  analysis  indicated  here  is  referred  to  as  a  "quasi-Bayesian  " 
analaysis  A  more  thorough  Bayesian  solution  of  the  transfer  function  estimation  problem,  would 
require  that  priors  be  specified  on  the  model  order  M.  A  completely  orthodox  Bayesian  analysis 
would  require  priors  on  the  model  order  and  parameters  of  the  stochastic  input  to  the  system. 


3.  ANALYSIS  OF  BOX-JENKINS  SERIES  J  DATA  AND  MONTE  CARLO 


RESULTS 

In  this  section  the  results  of  the  transfer  function  analysis  of  the  Box-Jenkins  Series  J  gas  fur¬ 
nace  data  by  the  smoothness  priors  and  Box-Jenkins  methods  are  described  and  compared  Some 
properties  of  the  smoothness  priors  method  of  analysis  are  shown  Also,  we  show  the  results  of 
Monte  Carlo  studies  of  the  statistical  performance  of  the  smoothness  priors,  (SP),  method  and  the 
SSLS  asymptotically  maximum  likelihood  method  of  Mayne  and  Firzoon  ,(1978),  based  on  models 
derived  from  the  BJ  series  J  data. 

The  input  output  BJ  series  J  data  are  shown  in  Figure  1.  The  variances  of  the  input  and 
output  data  are  1  14727  and  10.25357  respectively  For  illustrative  purposes,  the  data  shown  in 
Figure  1  was  normalized  to  have  zero  mean  and  the  same  variance.  (Inverting  the  output  data 
and  superimposing  it  over  the  input  data  reveal  the  output  to  be  a  delayed-low  pass  filtered  ver¬ 
sion  of  the  input  )  The  generic  Box-Jenkins  transfer  function  model  is 

y„  =  aiy»-i  -  •  -  *  V.  i  t  -  ~  (3  1 ) 

"»  =  <■  I *•<.-!  -  -  erun.r  -  ti„ 

In  (3.1)  {z„.y„  u„.  n  =  l . .V}  are  respectively  the  observed  input  and  output  and  the  unobserved 

added  noise.  Also  in  (3.1).  (u„.  n  =  l,.....V)  is  a  normal  zero  mean  i.i.d.  random  variable  with 
variance  ct*.  The  dimensional  parameters  of  the  BJ  model  are  d- 2.  p- 2,  q- 3,  r  =  2.  The  pub¬ 
lished  vectors  of  BJ  model  coefficients  are:  a  =  (0. 57, 0  01):6  =  (0. 53, -0  37,-0. 51  );«•  =  (  1 . 53. -0.63). 
a\  =  0  05058,  (Box-Jenkins,  Section  11.4).  For  the  AIC  optimal,  k,=4,ij=2.  SP  model,  the 
dimensional  parameters  are  p  =  q-r- 4.  The  d- 0  model  is  the  AIC  best  shift  parameter  model.  (In 
this  data  example,  the  higher  order  SP  model  automatically  accounts  for  the  delay  between  input 
and  output  data,  without  requiring  an  additional  non-zero  d  parameter  )  The  SP  a.b  polynomial 
coefficients  are  a  =  (1  58824.-0.70509.-0. 13 198.0. 14861 )  6  ^  (0  17090.- 0  43813.-0  17497.0  08608). 
The  vector  of  smoothness  priors  tradeoff  parameters  was  r  (0  0088C.0  10305  0  71914.0  10686) 


Table  1  shows  the  values  of  the  A1C  for  the  differential  orders  tfci,Jfca=  for  the  order  M=4 
model 

INSERT  FIGURE  1  HERE:  BOX-JENK1NS  SERIES  J  DATA 


J 

Table  1  AIC’ 

s  d=0,  M=4.  S 

P  Model,  Pari 

imetric  in  kl  a 

nd  k2 

kl  =  2 

w 

1! 

c* 

kl  =  4 

k  1 =5 

I 

46.886 

45.697 

45.6631 

46.345 

47.375 

2 

46.436 

46.186 

45.316 

45  995 

3 

46  273 

45.031 

45.120 

45.794 

46  436 

4 

46.672 

45.013 

45  096 

45.769 

46  414 

5 

46.410 

45.072 

45  072 

45  721 

46  388 

In  Figure  2.  the  original  Box-Jenkins  Series  J  output  data,  and  (he  SP  model  tracked  output 
data  are  shown  superimposed.  A  vertical  scale  displaced  version  of  the  difference  between  the  ori¬ 
ginal  and  tracked  data  is  also  shown  in  Figure  2.  The  tracked  output  data  is  computed  by  passing 
the  input  through  the  estimated  model.  The  appearance  of  the  .V=294  tracked  data  version  of  the 
BJ  modeled  data,  incorporating  the  d  =  2  parameter,  appears  very  similar  to  that  of  the  SP 
modeled  data  and  is  not  shown  here. 


In  the  sense  of  minimum  mean  square  tracking  error,  the  performance  of  the  AIC  optimum 
SP  and  BJ  models  were  similar.  The  variances  of  the  tracking  error  for  the  BJ  and  SP  models  (the 
sums  of  squares  of  the  residuals.  SSE),  were  0.70187  and  0  68662  respectively.  The  ratio  of  the 
relative  variance  of  the  tracking  error  to  the  variance  of  the  true  output  was  0.06798  and  0.06696 
for  the  BJ  and  SP  model  respectively. 


INSERT  FIGURE  2  HERE  TRACKED  DATA 


The  impulse  response,  transfer  function  amplitude  response,  phase  response  and  power  spec¬ 
tral  densities  (psdsj  of  the  residuals  associated  with  the  BJ  and  SP  models  are  shown  in  Figure  3. 
(Compare  the  transfer  function  and  residual  spectrum  of  a  windowed  penodogram  analysis  of  the 
B-J  data,  Jenkins  and  Watts,  p-446 . )  The  residual  psds  were  computed  from  Householder 
transformation-Akaike  AIC  criterion  AR  models.  The  coefficients  of  the  corresponding  AIC  best 
AR4  models  of  the  BJ  and  SP  residuals  were  {1.53458,-0.55879,-0.21378.0.16280}  and 
{1.58329,-0.65524.-0.17195.0.17461}  respectively  with  corresponding  innovations  variances 
0  05995  and  0.05755.  In  Figure  3,  the  SP  and  BJ  modeled  psds  of  the  residuals  are  almost  identi¬ 
cal  The  AR  -  AIC  model  of  the  residual  of  the  SP  modeled  BJ  Series  J  data  shown  in  Figure  3  is 
quite  similar  in  appearance  to  that  obtained  automatically  by  the  SP  modeling  procedure  and 
computed  directly  by  equation  (2  3  8) 

Also  in  Figure  3.  after  the  first  3  lime  points,  when  the  impulse  response  for  the  BJ  model  is 
zero,  the  impulse  responses  of  the  BJ  model  and  SP  model  appear  similar.  The  SP  model  impulse 
response  is  smoother  than  the  BJ  impulse  response.  The  initial  non  zero  going  part  of  the  SP 
impulse  response  is  a  consequence  of  the  fact  that  the  optimal  SP  model  delay  parameter  d  is  zero 

The  BJ  modeled  transfer  function  and  phase  function  versus  frequency  each  have  some  rela¬ 
tively  abrupt  kinks  in  their  responses  as  compared  to  those  for  the  SP  modeled  results. 

INSERT  FIGURE  3  HERE  IMPULSE  RESPONSE.  TRANSFER  FCN  k  PHASE  A 
NOISE  PSD’s  SP  k  BJ  MODELS 

Some  comments  on  the  stationarity  of  the  Box-Jenkins  Series  J  data  are  in  order  here.  In 
Figure  2.  the  true  BJ  output  data  and  the  SP  modeled  data  appear  more  discrepant  in  the  latter 
part  than  the  earlier  part  of  the  lime  series  Also  in  Figure  2.  there  are  relatively  large  excursions 
in  the  latter  part  of  the  residual  time  series.  That  evidence  suggests  that  the  Series  J  data  is  nons- 
tationary  To  examine  that  conjecture,  we  examined  the  residuals  of  the  AR  modeled  tracked 
data,  fitted  SP  models  to  the  first  200  data  points  and  to  the  first  75  data  points  of  the  Series  J 


data  and  examined  the  tracking  behavior  of  those  models.  The  properties  of  the  BJ  series  J  data 
do  appear  to  change  slightly  after  n  =  225 

The  impulse  response,  noise  spectrum,  transfer  function  and  phase  function  for  the 
i,  =  l.fc,=  l  SP  models  for  n  =  200  and  n  =  75  data  are  shown  superimposed  in  Figure  4A  and  4B 
respectively.  As  expected  from  Figure  2.  the  properties  of  the  SP  modeled  n  =  296  and  n  =  200  series 
J  data  are  quite  similar.  The  properties  of  the  SP  modeled  n  =  296  and  n  =  75  data  are  also  quite 
similar.  The  conjecture  that  the  SP  modeling  method  might  be  reasonable  for  relatively  short 
length  data  spans  is  supported  by  the  evidence  shown  in  Figure  4.  The  apparent  property  of  the 
SP  procedure  to  yield  reliable  parameter  estimates  with  relatively  short  data  length  time  series  is  a 
consequence  of  the  assumption  of  priors  on  the  mode!  parameters.  The  priors  are  equivalent  to  the 
observation  of  additional  data. 

For  completeness,  the  a.b  polynomial  coefficients  corresponding  to  SP  M  =  4.kt  =  1  .Jt2=  1 
modeled  data  are  respectively:  {l  13620.-0.15871,-0.29041.0.13128), 

{  1.08946.-0  24032.-0.55796.0.11972}  for  the  n  =  200  data  and 

{  1.25647.-0  20495.-0  35709.0.16760},  {1.18340.-0.54000.-0.36671.0.28644}  for  the  n  =  75  data. 
The  respective  residual  variances  of  the  SP  modeled  M- 4.  n  =  200  and  n  =  75  data  point  models 
were  0  09900  and  0  04629  respectively.  The  corresponding  relative  tracking  variance  ratios  were 
0.01078  and  0  00620 

INSERT  FIGURE  4  HERE:  IMPULSE  RESPONSE  etc  n  =  200  and  n  =  75  Models 

The  effects  of  the  choice  of  mode!  order,  Af,  and  the  differential  orders  t,,4,  on  the  impulse 
response,  the  noise  spectrum  and  the  transfer  function  amplitude  and  phase  of  the  SP  model  are 
also  of  interest 

First,  to  illustrate  the  effect  of  model  order  M  on  transfer  function  model  properties,  graphs 
of  impulse  response,  amplitude  and  phase  response  are  shown  superimposed  in  Figure  5A.B.C  for 
the  optima!  SP  A/--4  model  and  the  likelihood  best  SP  models  of  orders  Af=  10.20  and  Af^30.  The 


graphical  results  for  the  higher  order  SP  models  wiggle  only  slightly  around  those  for  the  order 
Af=4  model.  Those  results  indicate  that,  on  the  provision  that  the  model  order  is  sufficiently 
large,  the  specification  of  the  order  of  the  SP  model  does  not  very  critically  influence  the  transfer 
function  characteristics.  For  completeness,  some  of  the  computed  results  for  those  models  are: 
M=  10:  k1  =  l,*3=4,A/C=47.789;  M=  20:  i,  =  2,)b2=4M/C=50.224;  Af  =  30:  i,  =  2,tj=4,A7C=56.886. 
The  similarity  in  the  appearance  of  the  model  properties  for  the  different  model  orders  is  compati¬ 
ble  with  the  similarity  in  the  values  of  the  AIC  for  the  different  models. 

The  values  of  the  Af=10  optimal  SP  model,  a  and  b  polynomial  coefficients  are: 
a  =  {1.56980.-0.70201,-0.02424.0.06248.0.01102,-0.00000,-0.00152,-0.00046.0.00004,0.0010}. 
b  =  {0.15110.-0  38420.-0.17713,-0.00672.0.03029,0.08646,0.04176.0.00577,-0  01716.-0.00717}. 
The  pattern  of  a.b  polynomial  coefficients  is  similar  for  the  larger  order  SP  models.  The  tapering 
toward  zero  values  effect  of  the  smoothness  priors  constraints  on  the  model  parameters,  particu¬ 
larly  on  the  higher  order  a  polynomial  parameters  and  the  relatively  long  tail  b  model  parameters 
helps  explain  the  similarity  of  the  M =4, 10.20  and  Af=30  model  properties.  The  b  model  parame¬ 
ters  in  the  numerator  of  the  rational  polynomial  description  of  the  model  do  not  have  as  dramatic 
an  effect  as  do  the  a  polynomial  denominator  polynomial  parameters  on  the  model  properties. 
The  long  tail  b  polynomial  parameters  and  the  short  tail  a  polynomial  parameters  are  well  approx¬ 
imated  by  the  SP  \f  =  4  model. 

For  the  purposes  of  comparison,  we  also  fitted  ordinary  least  squares,  (OLS).  models  of  ord¬ 
ers  A/=5,10  and  \t-  20  to  the  BJ  series  J  data.  Graphical  results  of  the  impulse  response,  transfer 
function  and  phase  function  for  those  models  are  shown  in  Figures  5D,E,F.  The  A/=5  LS  model 
properties  are  very  similar  to  the  optimal  SP  model  properties  (The  relative  variance  of  the  track¬ 
ing  error  was  0.06706.  The  OLS  M=  5  model  is  actually  a  superior  model  of  the  BJ  series  J  data 
than  the  original  BJ  model  )  The  computed  properties  of  the  OLS  Af=10  and  A/=20  models  wig¬ 
gle  a  lot  more  around  the  SP  A/  =  4  mode!  than  the  SP  Af  =  30  model  This  is  very  dear  evidence 
that  the  SP  model  properties  are  relatively  insensitive  to  model  order  in  comparison  with  other. 


conventional  transfer  function  modeling  methods. 


INSERT  FIGURE  5  HERE  EFFECT  OF  MODEL  ORDER  M 

In  order  to  illustrate  the  effects  of  increasing  differential  orders  kl.k }  on  transfer  function  proper¬ 
ties.  graphs  of  those  properties  are  shown  for  SP  fc,=  l,t2=l  and  fc,  =  9.1}=9  models  of  order  M  =  4 
in  Figure  6.  The  amplitude  response,  noise  spectrum  and  phase  response  of  the  higher  order 
smoothness  constraint  differential  models  are  smoother  than  those  for  the  lower  differential  order 
model.  This  behavior  could  be  anticipated  because  the  priors  are  frequency  domain  roughness  con¬ 
straints. 

INSERT  FIGURE  6  HERE:  EFFECT  OF  DIFFERENTIAL  ORDERS  kx.k2 


We  note  that  the  formula  for  the  AIC  in  (2.5  12)  can  not  be  applied  to  determine  the  best  of 
alternative  models  with  different  model  orders  M.  In  the  notation  of  (2.5.6).  the  "initial  condi¬ 
tion"  values  of  {i,,y,;t  =  l-.V/,2-Af....O}  were  assumed  known.  In  modeling  data  with  an  SP  model 

of  order  M.  we  customarily  take  {z,,y,:«'=l . M}  as  initial  conditions  and  model  the  data  on  the 

remaining  .V-.Vf  data  points.  The  likelihood  is  actually  computed  for  the  last  S-M  observations, 
( V/w-i- •  •■•>*)•  In  l^4t  case,  models  of  different  M  orders  are  modeled  on  different  data  and  it  is  not 
appropriate  to  use  the  AIC  to  distinguish  between  models.  A  formula  that  permits  the  AlC’s  of 
models  of  different  orders  to  be  compared  is, 


A/C  = 


-  V 
jV-Af 


2log-Iikclihood  -  2(num6er  of  para meters  estimated). 


(32) 


The  formula  in  (3.2)  is  reasonable  under  the  assumption  that  the  data  is  stationary,  i.e.  the  pro¬ 
perties  of  the  first  M  vales  of  the  data  do  not  change  very  much  with  different  values  of  M. 
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Monte  Carlo  Results 


Monte  Carlo  simulation  studies  were  performed  to  explore  the  statistical  performance  of  the 
SP  method  of  transfer  function  estimation.  Some  comparisons  of  the  SP  and  Mayne  and  Firzoon 
(1978)  3SLS  procedure  were  also  done.  We  chose  to  compare  the  SP  with  the  SSLS  procedure 
because  the  latter  procedure  has  asymptotic  MLE  properties  and  is  easy  to  program.  The  principal 
topics  of  interest  are  the  bias  and  MSE  parameter  estimation  properties  and  transfer  function 
confidence  interval  properties  of  the  SP  and  the  SSLS  procedures.  We  show  computational  results 
of  simulation  studies  of  the  SP  and  SSLS  models  with  additive  AR  and  additive  MA  observation 
noises. 

Consider  the  transfer  function  model 

Vn  =  “iV»-l  ~  '  '  '  +  a,1tn-p  *  M»-i  +  •  '  '  \*n-,  +  (3  3) 

f 

with  vn  an  added  noise  process.  The  MA,  noise  model  is:  t„  =  £  cmu„_m  where  (u„)  is  a  zero 

m  -  0 

r 

mean  i.i.d.  process.  The  AR,  noise  model  is  t„  =  ^2  cm t„_m  *  u„  .  where  again  (un)  is  a  zero 

m  =  1 

mean  i.i.d  process  The  AR  observation  noise  model  is  the  Box-Jenkins  model  The  MA  observa¬ 
tion  noise  model  was  used  in  Astrom  and  Bohlin  (1965).  Since  then  it  has  been  used  extensively  in 
engineering  applications. 

The  3SLS  procedure  was  developed  as  an  alternative  to  the  comptationaih  extensive  max¬ 
imum  likelihood  method  for  the  MA  observation  noise  model.  For  convenience,  the  3SLS  pro¬ 
cedure  is  as  follows: 

Let  a.b.c  denote  the  AR.  MA  and  added  MA  noise  polynomials  respectively  in  equation  (3  3) 
i)l'stng  least  squares  (LS).  fit  a  "long"  a.b  polynomial  model,  to  the  {z„.yn.  n=l.  ,V)  input- 
output  data  and  compute  the  residual  time  series. 

li)  Fix  the  orders  of  the  a.b  and  c  polynomials  to  their  final  model  orders  and  use  the  original 
input-output  data  and  the  residuals  from  stage  i)  to  estimate  the  a  b  and  c  polynomial  coefficients 


iii)  Prewhiten  the  input  and  output  data  using  the  inverse  of  the  c  polynomial  determined  in  stage 
ii)  and  estimate  the  fixed  order  a.b  polynomials  coefficients  by  LS. 

We  fit  the  SSLS  model  to  the  original  Box-Jenkins  Series  J  data  in  order  to  verify  the 
relevance  of  that  procedure  for  a  comparison  of  results  AR  observation  noise  Monte  Carlo  study. 
An  first  stage  SSLS  a,fc  polynomial  model  order  p,?  =  8  was  determined  by  trial  and  error.  A  simi¬ 
lar  procedure  for  determining  the  stage  one  model  order  was  used  in  Hannan  et  al.  (1986).  The 
SSLS  A/=4  model  parameters  were:  o=(l. 58607, -0.67426, -0.17672.0. 16647), 

6  =  (0  19765.-0  44775,-0  24883.0.17300)  The  appearance  of  the  superimposed  SP 
M  =  4.  t ,  =  1 . -  ]  and  SSLS  modeled  impulse  response,  transfer  function  and  phase  response  were 
visually  indistinguishable.  On  the  basis  of  this  evidence,  it  was  thought  reasonable  to  examine  the 
performance  of  the  SSLS  transfer  function  mode)  with  AR  observation  noise  that  was  similar  to 
the  observation  noise  in  the  BJ  series  J  data. 

The  model  that  we  used  to  synthesize  data  for  the  Monte  Carlo  simulations  is  a  variation  of 
the  model  of  the  BJ  series  J  data.  The  input  data  for  the  simulation  was  the  Box-Jenkins  Series  J 
input  data  For  the  first  set  of  trials,  the  added  noise  was  a  stochastic  version  of  an  ARt  model  of 
the  residual  noise  from  the  SP  fit  to  the  Box-Jenkins  series  J  data  The  a.b  coefficients  of  the 
noiseless  simulation  model  were  o  =  (l  66283,-0  64256.  0  50648,0  22377) 

6  =  (-0  83218.-0  47872,-0  24869.0  12831)  The  ARt  model  coefficients  were 

c  =  (  1  69069.-0  69023. -0  28565.0.022507).  <rJ  =  0.284.  The  (biased)  SP  model  parameters  fitted  to 
a  noiseless  version  of  that  data  were  o  =  (l. 73481. -0.21383.-0  92282.0  40184). 
b  =  ( -0  03550, -0  05907,0  04443.0  05015)  The  vector  of  hyperparameiers  was 

r  =  (0.000062.0.000004.0  000027.0  000003).  Such  small  values  should  not  be  surprising  because  to 
within  roundoff  errors  the  noiseless  data  is  exactly  an  ARt  AM,  model. 

Results  of  the  statistical  properties  of  25  replications  from  the  SP  and  SSLS  models  for 
n  =  296,  A/?,.  data  points  are  shown  in  Table  2  The  output  data  is  regressed  partially  upon  itself 


so  the  SSLS  procedure  as  well  as  the  SP  procedure,  will  yield  biased  coefficient  estimates.  (The 
magnitude  of  the  biases  is  model  dependent.)  The  bias  of  the  SP  modeled  parameters  is  defined 
as  the  difference  between  the  mean  SP  parameters  and  the  zero  added  noise  3SLS  model  parame¬ 
ters.  The  standard  deviation  and  bias  errors  are  comparable  for  both  the  the  SP  and  SSLS  pro¬ 
cedures.  (The  standard  deviation  of  the  SP  modeled  estimated  b  polynomial  parameters  are  actu¬ 
ally  somewhat  smaller  than  those  for  the  SSLS  procedure.) 


Table  2  AIC’s, 

SP  &  3SLS  M=4  Models 

SP  M  =  4 

SSLS  M  =  4 

param 

mean 

std.  dev 

bias 

mean 

std.  dev. 

bias 

arl 

-1.6436 

0  0541 

-0.0820 

-1  6641 

0  0554 

-0  061 5 

ar2 

-0.7746 

0  1093 

-0.5833 

-0  8395 

0  1052 

-0.6483 

ar3 

-0.1577 

0.1082 

-0.7836 

-0  0911 

0  1054 

0.8502 

ar4 

0.2040 

0.0524 

-0.2029 

0.1775 

0  0538 

-0.2294 

mal 

-0.0404 

0.0217 

-0.0050 

-0.0596 

0  0484 

0  0243 

ma2 

-0.0284 

0.0146 

0.0310 

-0.0181 

0  0893 

0  0414 

ma3 

-0  0161 

0.0126 

0.0281 

-0  0078 

0  0861 

0  0519 

mat 

-0.0192 

0.0106 

-0.0200 

-0  0511 

0.0494 

-0  1019 

Figure  T  is  an  illustration  of  the  mean  and  plus  and  minus  on  sigma  of  results  estimated 
from  the  Monte  Carlo  trials.  The  illustrations  correspond  to  the  simulation  results  reported  in 
Table  2  From  Figure  7  it  appears  that  the  overall  mean  square  error  in  transfer  function  estimate 
is  slightly  smaller  for  the  SP  than  for  the  3SLS  method.  The  similarity  of  the  SP  and  3SLS  simu¬ 
lation  results  is  compatible  with  the  similarity  of  performance  of  those  models  on  the  BJ  series  J 
data. 

INSERT  FIGURE  7  HERE  SP  AND  3SLS  TRANSFER  FCN  MEAN  AND  STD.  DEVS. 


The  order  4a. fc  polynomial  3SLS  model  tends  to  be  overparameterized  In  order  to  verify 
that  the  SSLS  statistical  results  shown  in  Table  2  were  representative,  an  ARMAX  2.2.2  model 
was  also  simulated  and  modeled  by  the  SSLS  and  SP  A/  4  model  procedures  As  before,  the  input 
was  the  Box-Jenkins  series  J  input  data  The  additive  stochastically  modeled  noise  was  an  Aft- 
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version  of  the  residual  of  the  SP  modeled  series  J  data.  The  simulation  mode!  parameters  were 
a  =  (l. 46778, -0.55192). fc  =  (-0.03586.-0.06788). e  =  (l. 58778,-0.72461),  <t3  =  0.078  The  statistical 
results  of  25  replications  of  fitting  the  SP  and  SSLS  models  to  the  trial  data  are  shown  in  Table  3. 
Here,  the  bias  of  the  SP  modeled  parameters  is  defined  as  the  difference  between  the  mean  and  *ero 
added  noise  SP  model  parameters.  The  standard  deviation  and  bias  of  the  SP  Af=4  model  param¬ 
eters  are  comparable  to  those  in  Table  2  The  standard  deviation  and  bias  of  the  2  3SLS 
model  parameters  are  similar  to  those  observed  in  Table  2  for  the  A/=4  model. 


Table  3.  AIC  SP  M=4,  <k  3‘ 

SP  M  -  4 

5LS  M=2  Models 

SSLS  M  -  2 

param 

mean 

std.  dev. 

bias 

mean 

std.  dev 

bias 

arl 

1.5846 

0.0566 

-0.1502 

-1  1562 

0.0386 

-0  1633 

ar2 

-0.7048 

0.1072 

-0.4910 

-0.0095 

0.0356 

-0  1171 

ar3 

-0.0323 

0.0984 

0.8905 

- 

- 

• 

ar4 

0.0486 

0.0415 

-0.3532 

- 

- 

- 

mal 

-0  0443 

0.0222 

-0.0088 

-0  0095 

0.0406 

0  0264 

ma2 

-0  0362 

0.0383 

0.0229 

-0.0918 

0.0436 

-0.0231 

ma3 

-0  0001 

0.0243 

-0.0451 

- 

- 

ma4 

-0  0299 

0.0236 

0.0600 

- 

We  recall  that  the  added  noise  for  the  Table  2  data  was  ARt  and  that  for  Table  3  was  AR ,. 
The  consistency  of  the  tabulated  results  for  the  SP  model  in  Tables  2  and  3  suggest  that  the  SP 
model  is  reasonably  robust  with  respect  to  noise  color  and  noise  model  order. 

Finally,  we  show  results  of  simulation  studies  with  added  A/,4  observation  noise.  The  pri¬ 
mary  objects  of  interests  in  these  computational  experiments  were  a  comparison  of  the  SP  and 
SSLS  modeling  performance  and  the  sensitivity  of  the  SP  transfer  function  modeling  method  to 
choices  of  model  order  M,  the  differential  orders  klkJ,  the  observation  noise  level  and  the  sensi¬ 
tivity  to  data  length. 

The  model  for  these  simulations  was  a  slight  variation  of  the  model  for  the  AR  observation 
noise  simulations  The  superimposed  impulse  response,  transfer  function  and  phase  function  of  the 
3SLS  and  SP  Af  =  4.A/=5  and  A/  =10  on  the  noise  free  data  were  visually  indistinguishable.  In  the 
first  stage  of  the  3SLS  method  a  model  order  of  20  was  used  The  stochastic  input  signal  was  an 


ARt  version  of  the  input  of  the  series  J  data.  The  SSLS  M=i  noise  free  data  a, 6  polynomials 
were:  a  =  { 1 .72566.-0.19127. -0.94 126.0  40697} .  6  =  {-0.03550.-0.05940,0.04412.0.05070}. 

The  observation  noise  was  an  MAS  model,  (inadvertently  MAZ,  instead  of  MAt)  The  noise 
model  parameters  were  c  =  {1.0.0.6,0.5.01}.  This  model  has  a  fairly  fiat  psd  spectrum  A  sample 
run  of  the  MAS  noise  yielded  an  ARi0  AIC-AR  model  of  that  noise. 

Monte  Carlo  trials  with  n  =  296  and  n  =  78  were  done.  Superimposed  output  and  output  plus 
noise  and  a  displaced  version  of  the  noise  are  shown  in  Figure  8  for  typical  sample  trials  of  data 
lengths  n  =  296  and  n  =  78.  Figure  8  illustrates  the  noise  level  of  the  experiments  and  how  dramati¬ 
cally  short  the  n  =  78  data  span  is. 

Differential  orders  it,  and  t,  best  SP  M-5..Vf=10  and  .W  =  20  transfer  function  models  were 
fitted  to  a  sample  trial  of  the  n  =  296  data  with  the  following  results 
M  =  h,  t,  =  l,  tj=l,  .4/C=  - 1567.545.  M  =  10,  *,  =  4.  *,=  1.  .4/0-1609  348,  M  20.  k^i.k^Z 
A1C=  -  1557  410  The  similarity  of  the  AIC  values  suggest  that  these  models  will  have  similar 
properties 

The  computed  results  of  the  M.4S  noise  simulation  runs  for  the  SSLS  M- 4. 
SP  Af=  10  Jt,  =  lJt,  =  4  and  SP  M=  20.  A,  =2Jb2  =  4.  SP  M- 20  A,  =  1  .*,  =  3.  SP  M  =  20.  A, - 3  A,=  5  and 
an  SP  Af=  20.  t,  =  2.t2-4  model  of  data  with  observation  noise  variance  9  times  that  of  the  previ¬ 
ous  three  simulation  trials  are  shown  in  Figure  9A.B  C.D.E.F.  Tabulated  values  of  the  mean  and 
standard  deviation  of  the  parameter  estimates  for  the  SSLS  and  SP  Af=10.  20  L,^2.t5^4 

simulation  trials  are  shown  in  Table  4  The  SSLS  and  SP  M=  10  results  transfer  function  and 
phase  function  results  are  reasonable  similar  The  relative  jaggedness  of  the  impulse  response  for 
the  3SLS  model  indicates  that  a  lower  order  SSLS  model  might  be  more  suitable  for  this  data 
The  SP  ,Vf- 20.  /t,>,=  4  model  results  appear  very  similar  to  the  results  for  the  10  models.  The 
results  of  the  Monte  Carlo  runs,  with  SP  models  Af=20.t,-  1.L--  3  and  M  =  20.kl  =  Z,ki=h,  (Figures 
9D.E).  are  ver>  similar  to  those  shown  for  the  20. t,  -  2.i2  -  4.  (Figure  9C),  results  Those  illus¬ 
trations  show  the  relative  insenslivity  of  SP  modeling  to  variations  in  the  differential  orders  kt,k: 


As  exacted,  the  results  in  Figure  9F  for  the  SP  A/=20  model  the  9  times  variance  trials  show 


more  dispersion  than  the  results  for  the  Figure  9C  trials.  The  graphical  results  in  Figure9  are  com¬ 
patible  with  the  performance  of  the  3SLS  and  SP  models  of  the  Box-Jenkins  series  J  data. 


Table  4.  AIC.  3SLS  \1=4  &  SP  M=10, 20,20  Models 


3SLS  M=4 

SP  M 

=  10 

SP  M 

=20 

SP  M=20,  VAR  X  9 

param 

mean 

s.d. 

mean 

s.d. 

mean 

s.d. 

mean 

s.d. 

arl 

0.7217 

0.0469 

0.6420 

0.0356 

0.6100 

steiffr ;  91 

HER3-E  9 

0.0615 

ar2 

0.0588 

-0.0109 

0.0329 

-0.0154 

BffTiTEElM 

0.0532 

ar2 

0.0490 

0.0630 

-0.0013 

0.0111 

-0.0000 

0.0068 

ar4 

-0.0041 

0.0397 

0.0031 

0.0135 

0.0002 

0  0001 

0.0005 

ar5 

- 

0.0016 

0.0075 

0.0001 

0.0001 

0.0001 

mal 

-0.0387 

0.0235 

-0.0363 

0.0210 

-0.0437 

0.0248 

-0.0547 

ma2 

-01115 

0.0498 

-0.0958 

0.0366 

-0.0906 

0.0366 

-0.0907 

|.  '  [ 

ma3 

0.0060 

0.0557 

-0.0798 

0.0318 

-0.0884 

0.0308 

-0.0675 

0.1091 

ma4 

0.0333 

-0  0719 

0.0277 

-0.0674 

0  0252 

-0.0879 

mao 

- 

-0.0467 

0.0243 

-0.0628 

0.0253 

-0.0704 

_ 

In  Table  4  only  the  results  for  the  first  5  a.b  polynomial  coefficients  are  shown.  That  is 
sufficient  to  understand  the  results  of  the  Monte  Carlo  trials.  The  standard  deviations  for  the 
SSLS  and  Af-10  SP  trials  are  comparable.  Those  results  are  also  comparable  to  those  shown  in 
Table  2  for  the  AR  noise  trials  The  a  polynomial  coefficients  for  the  SP  trials  tend  to  taper 
quickly  and  correspondingly,  the  standard  deviations  for  those  coefficients  tend  to  be  smaller  than 
the  standard  deviations  for  the  larger  valued  coefficients.  The  values  of  the  fc  polynomial 
coefficients  do  not  taper  and  correspondingly  neither  do  the  standard  deviations  of  those  coefficient 
estimates. 


The  final  Monte  Carlo  trials  arc  for  the  shorter  length  n  =  78  data.  An  exploratory  computa¬ 
tion  on  a  single  trial  of  an  ,Y=78  data  SP  Af=10  model,  (that  yields  an  ,V  =  68  data  points  for 
actual  modeling)  indicated  that  the  A,  =  2. A2  =  1  model  was  optimum  for  that  data.  No  experiment 
was  performed  for  the  SP  .V/  =  5  model  Graphical  results  for  25  SP  .V/=5,i,  =  l,t:=l  and  SP 
M  - 10. A,  =  1  models  ,V/A  noise  Monte  Carlo  trials  are  shown  in  Figure  10.  Numerical  results 

corresponding  to  those  shown  in  Figure  10  are  in  Table  5  (Only  results  for  the  first  6  polynomial 


coefficients  are  shown.)  The  standard  deviations  for  the  n  =  78  data  coefficients  are  in  general 
larger  than  those  for  the  n  =296  data  coefficients.  The  SP  A/=10  n  =  78  data  parameter  mean 
values  and  standard  deviations  indicate  the  same  kind  of  tapering  effects  as  was  observed  for  the 


n  =  296  data.  The  graphical  results  for  the  SP  transfer  function  modeled  n=296  and  n  =  78  data 
look  very  similar.  This  is  additional  evidence  to  support  the  conjecture  that  Bayesian  smoothness 
priors  transfer  function  modeling  can  yield  reliable  transfer  function  estimation  with  relatively 
short  duration  data. 


Table  5. 

AIC,  3SLS  M=4  &  SP  M=  10,20.20  Models 

SP  M  =  4 

SP  M  =  10 

param 

mean 

std.  dev. 

mean 

std.  dev. 

arl 

0.7217 

0.0848 

0.6623 

0.0816 

ar2 

0.0196 

0.0850 

0.0537 

0.0671 

ar3 

0.0382 

0.0678 

0.0186 

0.0605 

ar4 

-0.0030 

0.0331 

0.0164 

0.0318 

ar5 

-0.0151 

0.0297 

0.0153 

0.0222 

ar6 

- 

- 

0  0062 

0.0220 

mal 

-0.0448 

0.0248 

-0.0416 

0.0551 

ma2 

-0.0772 

0.0344 

-0  0971 

0.0753 

ma3 

-0.0735 

0.0335 

-0  0663 

0.0426 

ma4 

-0.0521 

0.0322 

-0  0616 

0.0352 

ma5 

-0.0512 

0.0342 

-0.0330 

0.0382 

ma6 

• 

- 

-0  0125 

0.0162 

4.  INTERPRETATION  OF  RESULTS,  SUMMARY  AND  DISCUSSION 


A  smoothness  priors  approach  to  the  problem  of  transfer  function  estimation  was  demon¬ 
strated.  The  smoothness  priors  are  stochastic  constraints  on  the  parameters  of  the  linear  model. 
The  Bayesian  smoothness  priors  procedure  is  one  possible  way  of  utilizing  the  information  in  the 
likelihood  function.  The  critical  computation  in  the  Bayesian  smoothness  priors  approach  is  the 
likelihood  of  the  Bayesian  model.  The  likelihood  is  used  as  an  objective  measure  of  the  goodness  of 
a  model  and  the  hyperparameters  which  maximize  the  likelihood  are  determined  by  a  gradient 
search  method. 

Some  of  the  consequences  of  the  smoothness  priors  approach  to  transfer  function  estimation 

are: 

1)  The  complex  multiple  mode]  orders  selection  procedures  of  other  transfer  function  estimation 
methods  are  obviated  by  the  smoothness  priors  method.  Instead,  the  specification  of  the  values  of 
model  order.  .Vf,  and  the  differential  orders,  i,, kt.  appear  not  to  be  very  critical  in  determining  the 
impulse  response,  transfer  function  and  phase  properties  of  the  model. 

2)  Least  squares  and  conventional  maximum  likelihood  are  abandoned  as  a  criterion  for  modeling 
data.  Instead,  constrained  least  squares  criterion  or  equivalently  penalized  likelihood  methods  are 
used  in  the  smoothness  prior  method. 

3)  The  smoothness  priors  model  will  in  general  not  be  as  parsimonious,  in  the  number  of  model 
parameters,  as  the  conventional  least  squares  or  maximum  likelihood  transfer  function  estimation 
methods 

4)  Smoothness  priors  modeling  will  tend  to  be  more  robust  with  respect  to  the  specification  of  the 
observation  noise  color  than  the  conventional  methods. 

5)  Smoothness  priors  modeling  will  tend  to  yield  reliable  parameter  estimates  with  shorter  length 
data  than  conventional  methods 

The  relatively  large  model  order  non-parsimoniousness  property  of  the  Bayesian  smoothness 
priors  model  can  be  justified  In  any  alternative  Bayesian  transfer  function  modeling,  the  Bayesian 


model  would  be  an  average  of  the  transfer  function  computed  by  models  of  the  different  a,b,e  poly¬ 
nomial  model  orders  weighted  by  the  likelihood  and  the  priors  on  model  orders.  The  Bayesian 
transfer  function  model  would  have  contributions  from  the  largest  a,b,c  polynomial  model  orders 
considered.  As  a  result,  the  overall  Bayesian  model  orders  would  be  the  identical  to  the  orders  of 
that  largest  orders  model.  Thus,  the  Bayesian  procedure  will  have  a,b,c  polynomial  model  orders 
larger  than  the  model  selected  say  by  the  minimum  A1C  procedure.  By  the  specification  of  a  large 
model  order  and  smoothness  priors  constraints  on  the  model  parameters,  the  smoothness  priors 
method  achieves  the  effect  of  the  alternative  Bayesian  methods  of  averaging  the  computational 
results  of  different  order  models. 

The  use  of  frequency  domain  priors  is  a  relatively  new  idea  The  class  of  frequency  domain 
priors  that  we  used  lack  a  definite  physical  interpretation.  They  seem  reasonable  because  they 
penalize  the  higher  order  polynomial  coefficients  with  increasing  weights.  As  expected,  the  SP 
model  properties  are  increasingly  smooth  with  increasing  kth  derivative  constraints.  Also,  the 
overall  optimum  solution  is  not  necessarily  the  smoothest  solution  An  important  property  of  our 
frequency  domain  priors  is  that  they  permit  the  problem  of  transfer  function  estimation  to  be  cast 
within  the  framework  of  the  linear  model. 

The  Monte  Carlo  results  suggest  that  the  smoothness  priors  method  of  transfer  function  esti¬ 
mation  achieves  comparable  statistical  performance  with  the  asymptotically  efficient  procedures. 

In  summary,  the  smoothness  priors  method  of  transfer  function  estimation  appears  to  yield 
statistical  performance  results  that  are  comparable  and  perhaps  superior  to  other  well  known 
methods  while  enjoying  a  more  forgiving  parameter  computational  search  procedure  than  the 
search  procedures  of  other  methods.  In  general,  the  results  shown  attest  to  the  flexibility  of  the 


Bayesian  modeling  approach. 
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LEGENDS 


FIGURE  1.  Box-Jenkins  Series  J  Gas  Furnace  Data 

Figure  2.  Original  and  Modeled  Data  Superimposed  and  Difference  Data  Smoothness  Priors 
Model,  n  =  296 

FIGURE  3  Impulse  Response,  Amplitude  and  Phase  Response  and  Residua)  PSD  versus  Fre¬ 
quency.  Superimposed  Box-Jenkins  and  Smoothness  Priors  Model  Results,  (Dotted  Lines). 

FIGURE  4  Superimposed  Smoothness  Priors  Impulse  Response,  Amplitude  Response,  Phase 
Response  and  Noise  Spectrum  for 

A  n  =  296  and  n  =  200  data  points,  (dotted  lines)  B:  n=296  and  n  =  75  data  points,  (dotted  lines). 

FIGURE  5.  Superimposed  Impulse  Response.  Amplitude  Response,  Phase  Response  and  Noise 
Spectrum: 

A.  SP  SA- 4  and  M-  10  models  B  SP  .V/  =  4  and  M- 20  models 
C:  SP  ,v/  =  4  and  M-ZO  models  D:  SP  A/  =  4  and  LS  A/=5  models. 

E:  SP  M= 4  and  LS  A/=10  models  F:  SP  ,Vf=4  and  LS'Af=20  models. 

FIGURE  6.  Superimposed  Smoothness  Priors  Impulse  Response.  Amplitude  Response.  Phase 
Response  and  Noise  Spectrum  for  SP  A/  =  4  A,=  l.Fj=l  and  ki=9.kJ  =  9  (dotted  lines)  models. 

FIGURE  7  Mean  and  Plus  and  Minus  One  Standard  Deviation  Impulse  Response.  Amplitude 
Response,  Phase  Response,  AR  Observation  Noise: 

A:  Smoothness  Priors  Model  B  3SLS  Model 

FIGURE  &  Input.  Output  and  Ouput  Plus  Noise: 

A  .V-296.  Input.  Noise,  Superimposed  output  and  output  plus  noise.  Superimposed  output  and 
output  plus  larger  variance  noise,  (in  descending  order). 

B  .V--78.  Superimposed  output  and  output  plus  noise.  Noise 


FIGURE  9.  Mean  and  Plus  and  Minus  One  Standard  Deviation  Impulse  Response,  Amplitude 


Response  and  Phase  Response.  MA  Observation  Noise.  .V  =  296: 

A:  SSLS  V/  =  4  Model.  B:  SP  A/=10,  i,  =  2./tJ  =  4  Model 

C:  SP  20,  *,=  1.^=3  Model.  D  SP  M=  20,  *,=  S,is=5  Model. 

E.  SP  M=20.  i,  =  2.ij  =  4  Model.  F:  SP  A/=20.  it,  =  2,jt2=4  Model.  Variance  x  9. 

FIGURE  10.  Mean  and  Plus  and  Minus  One  Standard  Deviation  Impulse  Response,  Amplitude 

Response  and  Phase  Response.  MA  Observation  Noise,  A  =  78: 

A:  SP  A/=5  Jt,  =  2.tj=l  Model.  B:  SP  M=10,  Jb,  =  2 Jb2-  I  Model. 
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